Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

نویسندگان

  • Lindsey Bell
  • Rajesh Chowdhary
  • Jun S. Liu
  • Xufeng Niu
  • Jinfeng Zhang
چکیده

A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein-protein interactions, protein/gene regulations, protein-small molecule interactions, protein-GO relationships, protein-pathway relationships, and pathway-disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses--the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bio-TDS: bioscience query tool discovery system

Bioinformatics and computational biology play a critical role in bioscience and biomedical research. As researchers design their experimental projects, one major challenge is to find the most relevant bioinformatics toolkits that will lead to new knowledge discovery from their data. The Bio-TDS (Bioscience Query Tool Discovery Systems, http://biotds.org/) has been developed to assist researcher...

متن کامل

Designing an Ontology for Knowledge Discovery in Iran’s Vaccine

Ontology is a requirement engineering product and the key to knowledge discovery. It includes the terminology to describe a set of facts, assumptions, and relations with which the detailed meanings of vocabularies among communities can be determined. This is a qualitative content analysis research. This study has made use of ontology for the first time to discover the knowledge of vaccine in Ir...

متن کامل

Cluster Based Cross Layer Intelligent Service Discovery for Mobile Ad-Hoc Networks

The ability to discover services in Mobile Ad hoc Network (MANET) is a major prerequisite. Cluster basedcross layer intelligent service discovery for MANET (CBISD) is cluster based architecture, caching ofsemantic details of services and intelligent forwarding using network layer mechanisms. The cluster basedarchitecture using semantic knowledge provides scalability and accuracy. Also, the mini...

متن کامل

The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery

The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, funct...

متن کامل

JXP4BIGI: a generalized, Java XML-based approach for biological information gathering and integration

MOTIVATION In the post-genomic era, biologists interested in systems biology often need to import data from public databases and construct their own system-specific or subject-oriented databases to support their complex analysis and knowledge discovery. To facilitate the analysis and data processing, customized and centralized databases are often created by extracting and integrating heterogene...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 6  شماره 

صفحات  -

تاریخ انتشار 2011